Abstract

Earthquakes are a complex spatiotemporal phenomenon, the underlying mechanism for which 
is still not fully understood despite decades of research and analysis. We propose and develop a 
network approach to earthquake events. In this network, a node represents a spatial location while 
a link between two nodes represents similar activity patterns in the two different locations. The 
strength of a link is proportional to the strength of the cross-correlation in activities of two nodes 
joined by the link. We apply our network approach to a Japanese earthquake catalog spanning the 
14-year period 1985-1998. We find strong links representing large correlations between patterns 
in locations separated by more than 1000 km, corroborating prior observations that earthquake 
interactions have no characteristic length scale. We find network characteristics not attributable 
to chance alone, including a large number of network links, high node assortativity, and strong 
stability over time. 



I. INTRODUCTION 



Despite the underlying complexity of earthquake dynamics and their intricate spatiotemporal
behavior [HE], celebrated statistical scaling laws have emerged, describing the number of
events of a given magnitude (Gutenberg-Richter law) [3], the decaying rate of aftershocks
after a main event (Omori law) [IHE], the magnitude difference between the main shock and
its largest aftershock (Båth law) [7], as well as the fractal spatial occurrence of events
[BT-TTT]. Recent work has shown that scaling recurrence times according to the above laws
results in the distribution collapsing onto a single curve [12, 13]. However, while the
fractal occurrence of earthquakes incorporates spatial dependence, it appears to embed
isotropy in the form of radial symmetry, while the occurrence of real-world earthquakes is
usually anisotropic [H].

To better characterize this anisotropic spatial dependence as it applies to such heterogeneous
geography, network approaches have recently been applied to study earthquake catalogs [15-22].
These recent network approaches define links as being between successive events, between events
close in distance [TU], or between events which have a relatively small probability of both
occurring based on three of the above statistical scaling laws [22]. These methods define links
between individual events. In contrast, we define links between locations based on long-term
similarity of earthquake activity. While earlier approaches capture the dynamic nature of an
earthquake network, they do not incorporate the characteristic properties of each particular
location along the fault. Various studies have shown |H l2~21 - |2"T] that the interval times
between earthquake events for localized areas within a catalog have distributions not well
described by a Poisson distribution [28], even within aftershock sequences [27]. This
demonstrates that each area not only has its own statistical characteristics [29], but also
retains a memory of its events [21-26]. As a result, successive events may not be just the
result of uncorrelated independent chance but instead might depend on the history particular
to that location. If prediction is to be a goal of earthquake research, it makes sense to
incorporate interactions due to long-term behavior inherent to a given location, rather than
treating each event independently. In this paper we include such long-term behavior by
considering a network of locations (nodes) and interactions between them (links), where each
location is characterized by its long-term activity over several years.

II. DATA



For our analysis, we utilize data from the Japan University Network Earthquake Catalog
(JUNEC), available online at http://wwweic.eri.u-tokyo.ac.jp/CATALOG/junec/. We choose the
JUNEC catalog because Japan is among the most active and best observed seismic regions in the
world. Because our technique is novel, this catalog provided the best avenue for employing our
analysis. In the future, it may be possible to adapt our approach to sparser catalogs.

The data in the JUNEC catalog span 14 years, from 1 July 1985 to 31 December 1998, and are
depicted in Fig. 1. Each entry in the catalog includes the date, time, magnitude, latitude,
and longitude of the event. We found the catalog to obey the Gutenberg-Richter law [30] for
events of magnitude 2.2 or larger. By convention, this is taken to mean that the catalog can
be assumed to be complete in that magnitude range. However, because completeness over the full
14-year span does not guarantee completeness over shorter time periods, we also examine
Gutenberg-Richter statistics for each non-overlapping two-year period (Fig. 2) [30]. We find
that, though absolute activity varies by year, the relative occurrences of quakes of varying
magnitudes do not change significantly for events between magnitude 2.2 and 5, where there is
the greatest danger of events missing from the catalog.
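
As an illustration of this completeness check, the following minimal sketch fits the
Gutenberg-Richter relation log N = a - bM to a list of magnitudes. The function name
`gutenberg_richter_fit`, the use of an ordinary least-squares fit, the counting of events at or
above each threshold, and the 0.1 magnitude step are all assumptions made for illustration; the
text does not state how the statistics were fitted.

```python
import math

def gutenberg_richter_fit(magnitudes, m_min=2.2, m_max=5.0, step=0.1):
    """Fit log10 N(>= M) = a - b*M by least squares; return (a, b)."""
    n_steps = int(round((m_max - m_min) / step))
    ms, logs = [], []
    for s in range(n_steps + 1):
        m = m_min + s * step
        # Cumulative count of events at or above the threshold magnitude.
        count = sum(1 for x in magnitudes if x >= m)
        if count > 0:  # log10 is undefined for an empty count
            ms.append(m)
            logs.append(math.log10(count))
    mean_m = sum(ms) / len(ms)
    mean_l = sum(logs) / len(logs)
    slope = (sum((m - mean_m) * (l - mean_l) for m, l in zip(ms, logs))
             / sum((m - mean_m) ** 2 for m in ms))
    return mean_l - slope * mean_m, -slope
```

Applying such a fit separately to each two-year subset of the catalog and comparing the
resulting b values is one way to check that the magnitude distribution is stable over time.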

Additionally, the data are spatially heterogeneous, as shown in Fig. 1. Most events take place
either over land or off Japan's east coast. We note that this is not an artifact of more
detection equipment being located on land. The primary means for locating and detecting
earthquake events involves using the S-waves and P-waves that emanate from the events. Seismic
stations are capable of detecting these waves a great distance from their source. Both S-waves
and P-waves [31] travel through the Earth's mantle, and the characteristic absorption distance
for body waves, defined as the distance over which the wave amplitude drops to 1/e of its
original value, is on the order of 10,000 km [32]. Any event of magnitude 5.5 or larger, for
example, is detectable anywhere on Earth. Hence, the location of the detection equipment does
not affect how accurately events are catalogued. Additionally, because the location of the
Japanese archipelago is a consequence of seismic activity involving the Philippine and other
tectonic plates, it is not surprising that most seismic events take place on or near the
islands themselves.

III. METHOD



We partition the region associated with the JUNEC catalog as follows: we take the northernmost,
southernmost, easternmost, and westernmost extrema of all events in the catalog as the spatial
bounds for our analysis. We partition this region into a 23 x 23 grid which is evenly spaced in
geographic coordinates. Each grid square, of approximate size 100 km x 100 km, is regarded as a
possible node in our network. Results do not qualitatively differ when the fineness of the
spatial grid is modified, in agreement with analogous work in Ref. [20], which uses a different
technique from ours [18]. However, 100 km boxes are a more physical choice, as 100 km is on the
order of the rupture length associated with earthquakes [33], which in turn is roughly
equivalent to the aftershock zone distance for larger earthquakes [34].
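
The partition described above can be sketched as follows. This is a minimal illustration, not
the original analysis code: the function name `grid_cell`, the ordering of the `bounds` tuple,
and the convention that events lying exactly on the upper boundary fall into the last cell are
assumptions.

```python
def grid_cell(lat, lon, bounds, n=23):
    """Return the (i, j) index of the grid square that contains an event.

    bounds = (lat_min, lat_max, lon_min, lon_max) are the extrema of all
    events in the catalog; the grid is evenly spaced in degrees.
    """
    lat_min, lat_max, lon_min, lon_max = bounds
    # Fractional position of the event inside the bounding box, in [0, 1].
    fi = (lat - lat_min) / (lat_max - lat_min)
    fj = (lon - lon_min) / (lon_max - lon_min)
    # Events exactly on the upper edge go into the last cell (assumption).
    i = min(int(fi * n), n - 1)
    j = min(int(fj * n), n - 1)
    return i, j
```

For example, with hypothetical bounds `(20.0, 43.0, 120.0, 143.0)`, each cell spans one degree
in latitude and in longitude, which is roughly 100 km in latitude.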

For a given measurement at time $t$, an event of magnitude $M$ occurs inside a given grid
square. Similar to the method of Corral [27], we define the signal of a given grid square to
form a time series $\{s_t\}$, where each series term $s_t$ is related to the earthquake
activity that takes place inside that grid square within the time window $\Delta t$, as
described below.

Because events do not generally occur on a daily basis in a given grid square, it is necessary
to bin the data to some level of coarseness. How coarsely the data are binned involves a
trade-off between precision and data richness.

We define the best results as those corresponding to the most prominent cross-correlations. To
this end, we choose 90 days as the coarseness for our time series. This choice means that
$s_t$ covers a time window of $\Delta t = 90$ days and $s_{t+1}$ covers the non-intersecting
90-day period immediately following, giving approximately 4 increments per year. Additional
analysis shows that results do not qualitatively differ when the time coarseness is changed.

We refer to the time series $\{s_t\}$ belonging to each grid cell $ij$ as that grid cell's
signal. We define the signal, which is related to the energy released in the $ij$ grid cell, by

$$ s_t(ij) \equiv \sum_{l=1}^{N_t(ij)} 10^{\frac{3}{2} M_l}, \tag{1} $$

where $N_t(ij)$ denotes the number of events that occur in the $t$th time window in grid square
$ij$ and $M_l$ is the magnitude of the $l$th such event. We choose this definition because the
term $10^{\frac{3}{2} M}$ is proportional to the energy released from an earthquake of
magnitude $M$ [35]. The signal therefore is proportional to the total energy released at a
given location in a 90-day time period [36].
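
A minimal sketch of how such a signal could be accumulated for one grid square is given below.
It assumes event times are given in days and that each event contributes 10^(1.5 M) to the
90-day window containing it; the function name `energy_signal` and the argument layout are
hypothetical.

```python
def energy_signal(events, t0, n_windows, window_days=90.0):
    """Build the signal {s_t} for one grid square.

    events: iterable of (time_in_days, magnitude) pairs for that square.
    Each event adds 10**(1.5 * M) to the window that contains it.
    """
    signal = [0.0] * n_windows
    for t, mag in events:
        k = int((t - t0) // window_days)  # index of the 90-day window
        if 0 <= k < n_windows:            # ignore events outside the series
            signal[k] += 10.0 ** (1.5 * mag)
    return signal
```

Because of the exponent, one magnitude 4 event in a window contributes 1000 times more than one
magnitude 2 event, so a single large event can dominate that window's value.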

To define a link between two grid squares, we calculate the Pearson product-moment correlation
coefficient $r_{x,y}$ between the two time series $\{x_t\}$, $\{y_t\}$ associated with those
two grid squares [ID],

$$ r_{x,y} \equiv \frac{\langle x y \rangle - \langle x \rangle \langle y \rangle}{\sigma_x \sigma_y}, \tag{2} $$

where $\langle \ldots \rangle$ indicates the mean and $\sigma_x$, $\sigma_y$ the standard
deviations of the time series $\{x_t\}$, $\{y_t\}$. We consider the two grid squares linked if
$r_{x,y}$ is larger than a specified threshold value $r_c$, where $r_c$ is a tunable
parameter. As is standard in network-related analysis, we define the degree $k$ of a node to be
the number of links the node has. Note that our signal definition, Eq. (1), involves an
exponentiation of numbers of order 1. This means that the energy released, and therefore the
cross-correlation between two signals, is dominated by large events. Examples of signals with
high correlation are shown in Fig. 3.
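
The link definition above can be sketched in a few lines. The helper names `pearson`,
`build_links`, and `degrees`, the dictionary layout of `signals`, and the choice to return 0
for a constant signal are assumptions made for illustration.

```python
import math

def pearson(x, y):
    """Pearson coefficient of two equal-length series.

    The centered covariance used here is algebraically equal to
    <xy> - <x><y>.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    if sx == 0.0 or sy == 0.0:
        return 0.0  # assumption: a constant signal forms no links
    return cov / (sx * sy)

def build_links(signals, r_c):
    """Return {(a, b): r} for every pair of nodes with r > r_c."""
    nodes = sorted(signals)
    links = {}
    for p, a in enumerate(nodes):
        for b in nodes[p + 1:]:
            r = pearson(signals[a], signals[b])
            if r > r_c:
                links[(a, b)] = r
    return links

def degrees(links):
    """Degree k of each node: the number of links it takes part in."""
    k = {}
    for a, b in links:
        k[a] = k.get(a, 0) + 1
        k[b] = k.get(b, 0) + 1
    return k
```

Here `signals` maps a grid-cell index such as `(i, j)` to its time series, so raising `r_c`
simply removes the weaker entries from the returned dictionary.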

To confirm the statistical significance of $r_{x,y}$, we compare $r_{x,y}$ of any two given
signals with $r_{x,y}$ calculated by shuffling one of the signals. We also compare $r_{x,y}$
with the cross-correlation $r_{x,y}(\tau)$ we obtain by time-shifting one of the signals by
varying time increments $\tau$,

$$ r_{x,y}(\tau) = r(s_{x,t}, s_{y,t+\tau}), \tag{3} $$

where $\tau$ is in units of 90 days. Further, we impose periodic boundaries

$$ t + \tau = (t + \tau) \bmod t_{\max}, \tag{4} $$

where $t_{\max}$ is the length of the series. Our justification for these boundaries is that
events in the distant past (>10 years) should have negligible effects on the present, while
they also provide typical background noise for comparison.
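
The two controls described above, time-shifting with periodic boundaries and shuffling, can be
sketched as follows. The function names and the use of Python's `random` module for the
permutation are assumptions; the text does not specify how the shuffling was implemented.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def shifted_correlation(x, y, tau):
    """r_{x,y}(tau): correlate x_t with y_{t+tau}, wrapping t + tau
    modulo the series length (periodic boundaries)."""
    n = len(y)
    y_shifted = [y[(t + tau) % n] for t in range(n)]
    return pearson(x, y_shifted)

def shuffled_correlation(x, y, rng=random):
    """Correlation of x with a random permutation of y."""
    y_perm = list(y)
    rng.shuffle(y_perm)
    return pearson(x, y_perm)
```

With periodic boundaries, shifting by the full series length returns the unshifted correlation,
so only offsets smaller than the series length carry new information.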

We note that over the 14-year period 1985-1998, the overall observed activity increases in the
areas covered by the catalog. To ensure that the $r_{x,y}$ values we calculate are not simply
the result of trends in the data, we compare our results to those obtained with linearly
detrended data [37]. We find that the trends do not have a significant effect. For example,
using $r_c = 0.7$, we obtain 815 links, while detrending the data results in only 3 links
dropping below the threshold correlation value. For $r_c = 0.6$, we obtain 1003 links, while
detrending results in only 3 links dropped. Additionally, after detrending, 94% of correlation
values stay within 2% of their original values.
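
One simple way to remove a linear trend from a signal is sketched below. It assumes an ordinary
least-squares straight line fitted against the window index; the exact detrending procedure of
Ref. [37] may differ, and the function name `detrend_linear` is hypothetical.

```python
def detrend_linear(s):
    """Subtract the least-squares straight line from a series (len >= 2)."""
    n = len(s)
    t_mean = (n - 1) / 2.0          # mean of the time indices 0..n-1
    s_mean = sum(s) / n
    num = sum((t - t_mean) * (v - s_mean) for t, v in enumerate(s))
    den = sum((t - t_mean) ** 2 for t in range(n))
    slope = num / den
    # Residual after removing the fitted line s_mean + slope * (t - t_mean).
    return [v - (s_mean + slope * (t - t_mean)) for t, v in enumerate(s)]
```

Recomputing the correlations on the detrended signals and counting how many links fall below
the threshold gives the kind of comparison reported above.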

IV. RESULTS 

As described above, we compare $r_{x,y}(0) = r_{x,y}$ of Eq. (3) between signals at different
locations at the same point in time with $r_{x,y}(\tau)$ and with the correlation coefficient
obtained by shuffling one of the series. Shuffling or time-shifting by a single time step
(representing 90 days) reduces $r_{x,y}$ to within the margin of significance, as shown in
Fig. 4. We find a large number of links with cross-correlations far larger than their shuffled
counterparts. The number of links exceeds that of time-shuffled data by roughly
$3\sigma$-$8\sigma$, depending on the choice of $r_c$, as shown in Fig. 5(a). However, as shown,
there are still many links that can be regarded as the result of noise. We therefore further
examine the difference between the number of links found in time-shuffled data and the number
found in the original data [Fig. 5(b)]. We find that the fraction of "real" links in general
increases with $r_c$.
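
The comparison with time-shuffled data can be summarized by two numbers, sketched below. The
function names are hypothetical, and defining the "real" fraction as the excess over the
shuffled mean divided by the observed count is an assumption, since the text does not give an
explicit formula.

```python
import math

def excess_in_sigma(n_observed, shuffled_counts):
    """Standard deviations by which n_observed exceeds the shuffled mean."""
    m = len(shuffled_counts)
    mean = sum(shuffled_counts) / m
    # Sample variance of the link counts over the shuffled realizations.
    var = sum((c - mean) ** 2 for c in shuffled_counts) / (m - 1)
    return (n_observed - mean) / math.sqrt(var)

def real_fraction(n_observed, shuffled_counts):
    """Fraction of observed links in excess of the shuffled mean (assumed
    definition of the "real" fraction)."""
    mean = sum(shuffled_counts) / len(shuffled_counts)
    return (n_observed - mean) / n_observed
```

Here `shuffled_counts` holds the number of links found in each shuffled realization for a fixed
threshold, so repeating the calculation for several thresholds traces out the dependence on the
threshold.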

A significant fraction of these links connect nodes farther apart than 1000 km, as can be seen
in Fig. 6. This is consistent with the finding that there is no characteristic cut-off length
for interactions between events [2"0"l 123], corroborated by Fig. 7, which shows the number of
links a network has at a given distance as a fraction of the number of links that are possible
from choosing any two nodes in the potential network. Distances shorter than 100 km have sparse
statistics due to the coarseness of the grid, while distances greater than 2300 km have sparse
statistics due to the finite spatial extent of the catalog. Within this range, the fraction of
links observed drops off, to a good approximation, no faster than a power law. We find
qualitatively similar results when we adjust the grid coarseness.
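
The normalization used for the distance dependence can be sketched as follows. This is an
illustrative sketch only: it assumes node positions are taken as grid-cell centers in degrees,
uses great-circle (haversine) distances with a mean Earth radius of 6371 km, and bins distances
in 100 km steps; the function names are hypothetical.

```python
import math

def haversine_km(p, q, radius_km=6371.0):
    """Great-circle distance between (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(min(1.0, h)))

def link_fraction_by_distance(centers, links, bin_km=100.0):
    """Per distance bin: (number of links) / (number of possible pairs)."""
    nodes = sorted(centers)
    possible, observed = {}, {}
    for p, a in enumerate(nodes):
        for b in nodes[p + 1:]:
            k = int(haversine_km(centers[a], centers[b]) // bin_km)
            possible[k] = possible.get(k, 0) + 1
            if (a, b) in links or (b, a) in links:
                observed[k] = observed.get(k, 0) + 1
    return {k: observed.get(k, 0) / possible[k] for k in sorted(possible)}
```

Dividing by the number of possible pairs in each bin removes the purely geometric effect that
some separations are much more common than others on a finite grid.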

Our results, shown in Fig. 6, are anisotropic, with the majority of links occurring at
approximately 37.5 degrees east of north. This is roughly along the principal axis of Honshu,
Japan's main island, and parallel to the highly active fault zone formed by the subduction of
the Philippine and Pacific tectonic plates under the Amurian and Okhotsk plates, respectively.
High-degree nodes (i.e., nodes with a large number of links) tend to be found in the northeast
and north-central regions of the JUNEC catalog and are notably not strongly associated with the
locations in the catalog that are most active, which we discuss in further detail below.
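
As a hedged illustration of how a link orientation in degrees east of north might be measured,
the sketch below uses the initial great-circle bearing between two node positions and folds it
into the range from 0 to 180 degrees, since a link has no direction. The function name and the
choice of the great-circle bearing are assumptions; the text does not state how the angle was
obtained.

```python
import math

def link_orientation_deg(p, q):
    """Orientation of the link p-q in degrees east of north, folded
    modulo 180 because a link has no preferred direction."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    dlon = lon2 - lon1
    y = math.sin(dlon) * math.cos(lat2)
    x = (math.cos(lat1) * math.sin(lat2)
         - math.sin(lat1) * math.cos(lat2) * math.cos(dlon))
    bearing = math.degrees(math.atan2(y, x))  # initial bearing in degrees
    return bearing % 180.0
```

A histogram of this quantity over all links would show a peak near the dominant orientation if
the network is anisotropic.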

In network physics, we often characterize networks by the preference for high-degree nodes to
connect to other high-degree nodes. The strength of this preference is quantified by the
network's assortativity, defined as

$$ A = r_{k_1, k_2}, \tag{5} $$

where $r$ is the Pearson correlation coefficient given by Eq. (2). The series $\{k_1\}$ and
$\{k_2\}$ are found as follows: iterating through all entries $i, j$ in the adjacency matrix
[38], the degree of each node $i$ is appended to the series $\{k_1\}$ and the degree of the
node $j$ that $i$ is linked to is appended to the series $\{k_2\}$. The assortativity
coefficient thus gives a correlation of node degree within the network. If each node of degree
$k$ connects only to nodes of the same degree, the two series $\{k_1\}$ and $\{k_2\}$ will be
identical and $A = 1$. Networks like the network of paper coauthorship have positive
assortativity, while those of the World-Wide Web and of many ecological and biological systems
have negative assortativity [39].
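
The construction of the two degree series can be sketched directly from an adjacency matrix.
The sketch assumes a symmetric 0/1 matrix stored as a list of lists, so each link is visited
once in each direction; the function names and the convention of returning 0 when all degrees
are equal are assumptions.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def assortativity(adj):
    """A = r(k1, k2), built by scanning all linked entries (i, j)."""
    n = len(adj)
    k = [sum(row) for row in adj]     # degree of each node
    k1, k2 = [], []
    for i in range(n):
        for j in range(n):
            if adj[i][j]:             # node i is linked to node j
                k1.append(k[i])
                k2.append(k[j])
    return pearson(k1, k2)
```

A star graph, in which one hub connects only to degree-1 nodes, gives A = -1, while a network
made of a triangle plus a separate single link gives A = 1.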

Fig. 8 shows that the networks resulting from our procedure are highly assortative, with
assortativity generally increasing with $r_c$. The finding of positive correlation between the
degree of a node and the degree of its neighbors is consistent with an analogous finding [20]
with Iranian data, using a different technique from ours [18]. For comparison, we show the
assortativity obtained by using time-shuffled networks. Because the assortativity of the
original networks is far higher than that of the shuffled systems, the high assortativity
cannot be due to a finite-size effect or to the spatial clustering displayed in the data, since
time shuffling preserves location. We investigate the nature of the high-degree nodes and find
that high degree is not a matter of more events being nearby, as there is a slight tendency for
higher-degree nodes to actually have longer-distance links on average than low-degree nodes.
Additionally, we find that node degree is essentially independent of both maximum earthquake
size and number of events.

Because Fig. 5 shows, as mentioned above, that many links can be regarded as the result of
noise, we investigate the stability of links over time (Fig. 9). We find the similarity of the
network between the first seven years (1985-1992) and the second seven years (1992-1998) in the
catalog as follows. We find the set of links that satisfy $r > r_c$ in both the 1985-1992
network and the 1992-1998 network, and create a series out of the respective link strengths
(correlations) in the 1985-1992 network. We create another series using the same links, now
using the corresponding strengths from the 1992-1998 network. We then correlate the two series
using the Pearson correlation coefficient given by Eq. (2). We find that the network is far
more stable over time than counterpart results given by shuffling the time series (Fig. 9).
Because one would expect large correlations that arise purely from noise to have no "memory"
from one time period to another, the finding of network stability over several years is
consistent with our result that these links are not simply the result of chance.
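
The similarity measure described above can be sketched as follows. The sketch assumes each
period's correlations are stored as a dictionary mapping a node pair to its correlation value;
the function names and the choice to return `None` when fewer than two common links exist are
assumptions.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def network_similarity(corr_a, corr_b, r_c):
    """Correlate the strengths of links that exceed r_c in both periods."""
    common = sorted(p for p in corr_a
                    if p in corr_b and corr_a[p] > r_c and corr_b[p] > r_c)
    if len(common) < 2:
        return None  # assumption: similarity undefined for fewer than 2 links
    strengths_a = [corr_a[p] for p in common]
    strengths_b = [corr_b[p] for p in common]
    return pearson(strengths_a, strengths_b)
```

Running the same function on networks built from shuffled signals provides the control against
which the stability of the original network is judged.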

V. DISCUSSION AND CONCLUSIONS 

To summarize our results, we have introduced a novel method for analyzing earthquake activity
through the use of networks |JT]. The resulting networks (i) display links with no
characteristic length scale, (ii) display far more links than expected from chance alone, (iii)
are far more assortative than their time-shuffled counterparts, and (iv) display significantly
more link stability over time than those counterparts. The lack of a characteristic length
scale is consistent with previous work and underscores the difficulty in making accurate
predictions. The statistical significance of all of these results is consistent with the
possible presence of hidden information in a catalog that is not captured by existing models or
previous earthquake network approaches.

We thank K. Yamasaki for useful discussions, and the DTRA, ONR, European EPIWORK and LINC
projects, and the Israel Science Foundation for financial support.

FIG. 1: (Color online) Number of events by location in the JUNEC catalog, shown in a 23 x 23 
mesh. Larger circles with brighter colors denote more events. The JUNEC catalog clusters spatially, 
with most activity occurring on the eastern side of Honshu, Japan's main island. 

FIG. 2: (Color online) Gutenberg-Richter statistics for the JUNEC catalog over separate 2-year
periods, demonstrating that the magnitude above which the Gutenberg-Richter law is obeyed is
approximately constant from year to year. The Gutenberg-Richter law states that the number $N$
of events greater than a given magnitude $M$ obeys $\log N = a - bM$, with $b \approx 1$.

FIG. 3: (Color online) Examples of highly correlated signals, as defined in Eq. (1), with
values of $r$ marked above: (a) two signals with Pearson correlation coefficient $r = 0.9617$,
associated with locations 878 km apart; (b) the corresponding $r$ as a function of time offset,
as defined by Eq. (3); (c) corresponding scatterplot of (a), with the signal of grid cell
(10, 12) plotted against the signal of grid cell (2, 14). Each point corresponds to a single
point in time for the simultaneous signals of (10, 12) and (2, 14). Note that, because the
signal is defined in terms of exponentiation, large events dominate the correlation, just as
large events dominate the total energy released in an earthquake catalog.

FIG. 4: (Color online) Testing the statistical significance of cross-correlations to
demonstrate that the correlations observed are stronger than ambient noise. Shown is the
cross-correlation between various pairs of signals vs. time offset. For each pair of signals
with a cross-correlation $r > r_c$, we shift one of the signals in time and calculate the new
correlation coefficient. Each colored line is a comparison of a pair of signals, as described
by Eq. (3). Note the strong peak at $\tau = 0$, corresponding to signals being compared at the
same time. Offsetting the signals in time results in lower cross-correlation, dropping to the
level of noise in the actual data. As a control, we shuffle the signals and calculate the
cross-correlation for different time shifts (shown below each figure). Shown are links for
which (a) $r(0) > r_c = 0.7$ and (b) $r(0) > r_c = 0.9$.

FIG. 5: (Color online) Demonstration that empirical data show far more links than time-shuffled
data. (a) In black is the distribution of the number of links obtained in the network after
time-shuffling the data many times. A link corresponds to a correlation coefficient between two
signals of $r > r_c$. Shown is the case $r_c = 0.8$. Actual results, shown in red, lie more
than $5\sigma$ from the mean of the shuffled distribution, with about 17% more links than that
mean. (b) Results are similar for other values of $r_c$. We note that the fraction of links we
can regard as "real" or meaningful in general increases with $r_c$.

FIG. 6: (Color online) Network links superimposed on a map of the Japanese archipelago,
including Japan's main island Honshu. Note that links are anisotropic and primarily lie
parallel to the principal axis of Honshu. Shown are links satisfying $r > r_c$ that are
connected to high-degree nodes ($k > k_{\min}$). Darker colors (red online) indicate stronger
links (i.e., stronger cross-correlations). Links shown satisfy (a) $r_c = 0.9$, $k_{\min} = 5$;
(b) $r_c = 0.8$, $k_{\min} = 7$; (c) $r_c = 0.7$, $k_{\min} = 8$; (d) $r_c = 0.5$,
$k_{\min} = 8$. These choices for $r_c$ and $k_{\min}$ give approximately 70, 70, 90, and 90
links, respectively.

FIG. 7: (Color online) Demonstration that links have no characteristic length scale. To this
end, we show the number of network links at a given distance as a fraction of how many links
are possible at that distance from choosing any pair of nodes. Distances less than 100 km have
sparse statistics due to the coarseness of the spatial grid, while distances greater than
2300 km have sparse statistics due to the finite spatial extent of the catalog.

FIG. 8: (Color online) Demonstration that earthquake networks are highly assortative [see
Eq. (5)] for a wide range of $r_c$, with assortativity $A$ generally increasing with $r_c$.
$A > 0$ indicates that high-degree nodes tend to link to high-degree nodes and low-degree nodes
tend to link to low-degree nodes. For comparison, assortativity values obtained from networks
using time-shuffled data demonstrate that these findings are neither a finite-size effect nor a
result of spatial clustering, since time-shuffling preserves location.

FIG. 9: (Color online) Correlation networks display stability over time. Shown is the
similarity of the 1985-1992 network with the 1992-1998 network. Similarity is obtained by (i)
selecting the set of links that satisfy $r > r_c$ in both networks, (ii) making one series out
of the strengths (cross-correlations) in the 1985-1992 network and creating another series out
of the corresponding strengths in the 1992-1998 network, and (iii) correlating the two series
using the Pearson cross-correlation coefficient given by Eq. (2).